154 research outputs found

    A novel method for scaling iterative solvers: avoiding latency overhead of parallel sparse-matrix vector multiplies

    In parallel linear iterative solvers, sparse matrix-vector multiplication (SpMxV) incurs irregular point-to-point (P2P) communications, whereas inner product computations incur regular collective communications. These P2P communications cause an additional synchronization point with relatively high message latency costs due to small message sizes. In these solvers, each SpMxV is usually followed by an inner product computation that involves the output vector of SpMxV. Here, we exploit this property to propose a novel parallelization method that avoids the latency costs and synchronization overhead of P2P communications. Our method involves a computational and a communication rearrangement scheme. The computational rearrangement provides an alternative method for forming the input vector of SpMxV and allows P2P and collective communications to be performed in a single phase. The communication rearrangement realizes this opportunity by embedding P2P communications into global collective communication operations. The proposed method guarantees a fixed upper bound on the maximum number of messages communicated, regardless of the sparsity pattern of the matrix. The downside, however, is an increased message volume and a negligible amount of redundant computation. We favor reducing the message latency costs at the expense of increasing message volume. Yet, we propose two iterative-improvement-based heuristics to alleviate the increase in volume through one-to-one task-to-processor mapping. Our experiments on two supercomputers, Cray XE6 and IBM BlueGene/Q, on up to 2,048 processors show that the proposed parallelization method exhibits superior scalable performance compared to the conventional parallelization method.
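The property this abstract relies on, an SpMxV immediately followed by an inner product on its output vector, can be seen in a serial sketch of the conjugate gradient method. Conjugate gradient is chosen here only as an assumed example of such a solver, and the function names are hypothetical. This is not the authors' parallelization method. The comments mark where a distributed implementation would typically communicate.

```python
def spmv_csr(indptr, indices, data, x):
    """Return y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite A (serial sketch)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        # SpMxV: in a parallel run, this step needs irregular P2P exchanges
        # of the entries of p owned by other processors.
        q = spmv_csr(indptr, indices, data, p)
        # Inner product on the SpMxV output q: in a parallel run, this is
        # a collective reduction across all processors.
        alpha = rr / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

For example, `conjugate_gradient([0, 2, 4], [0, 1, 0, 1], [4.0, 1.0, 1.0, 3.0], [1.0, 2.0])` solves a 2×2 system whose exact solution is (1/11, 7/11).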

    Emerging accelerator platforms for data centers

    CPU and GPU platforms may not be the best options for many emerging compute patterns, which has led to a new breed of emerging accelerator platforms. This article gives a comprehensive overview with a focus on commercial platforms.

    Guest Editors' Introduction: Hardware Accelerators for Data Centers

    [No abstract available]

    Architectural requirements for energy efficient execution of graph analytics applications

    Intelligent data analysis has become more important in the last decade especially because of the significant increase in the size and availability of data. In this paper, we focus on the common execution models and characteristics of iterative graph analytics applications. We show that the features that improve work efficiency can lead to significant overheads on existing systems. We identify the opportunities for custom hardware implementation, and outline the desired architectural features for energy efficient computation of graph analytics applications. © 2015 IEEE

    A Template-Based Design Methodology for Graph-Parallel Hardware Accelerators

    Graph applications have been gaining importance in the last decade due to emerging big data analytics problems such as Web graphs, social networks, and biological networks. For these applications, traditional CPU and GPU architectures suffer in terms of performance and power consumption due to irregular communications, random memory accesses, and load balancing problems. It has been shown that specialized hardware accelerators can achieve much better power and energy efficiency than general-purpose CPUs and GPUs. In this paper, we present a template-based methodology specifically targeted at hardware accelerator design for big-data graph applications. Important architectural features that are key for energy-efficient execution are implemented in a common template. The proposed template-based methodology is used to design hardware accelerators for different graph applications with little effort. Compared to an application-specific high-level synthesis methodology, we show that the proposed methodology can generate hardware accelerators with up to 18× better energy efficiency and requires less design effort.

    Hardware accelerator design for data centers

    As the size of available data is increasing, it is becoming inefficient to scale the computational power of traditional systems. To overcome this problem, customized application-specific accelerators are becoming integral parts of modern system on chip (SOC) architectures. In this paper, we summarize existing hardware accelerators for data centers and discuss the techniques to implement and embed them along with the existing SOCs. © 2015 IEEE

    Graph Analytics Accelerators for Cognitive Systems

    Hardware accelerators are known to be performance and power efficient. This article focuses on accelerator design for graph analytics applications, which are commonly used kernels for cognitive systems. The authors propose a templatized architecture that is specifically optimized for vertex-centric graph applications with irregular memory access patterns, asynchronous execution, and asymmetric convergence. The proposed architecture addresses the limitations of existing CPU and GPU systems while providing a customizable template. The authors' experiments show that the generated accelerators can outperform a high-end CPU system with up to 3 times better performance and 65 times better power efficiency. © 1981-2012 IEEE

    Energy Efficient Architecture for Graph Analytics Accelerators

    Specialized hardware accelerators can significantly improve the performance and power efficiency of compute systems. In this paper, we focus on hardware accelerators for graph analytics applications and propose a configurable architecture template that is specifically optimized for iterative vertex-centric graph applications with irregular access patterns and asymmetric convergence. The proposed architecture addresses the limitations of the existing multi-core CPU and GPU architectures for these types of applications. The SystemC-based template we provide can be customized easily for different vertex-centric applications by inserting application-level data structures and functions. After that, a cycle-accurate simulator and RTL can be generated to model the target hardware accelerators. In our experiments, we study several graph-parallel applications and show that the hardware accelerators generated by our template can outperform a 24-core high-end server CPU system by up to 3x in terms of performance. We also estimate the area requirement and power consumption of these hardware accelerators through physical-aware logic synthesis, and show up to 65x better power consumption with significantly smaller area. © 2016 IEEE

    Are olive pomace powders a safe source of bioactives and nutrients?

    First published: 10 September 2020. BACKGROUND: The olive oil industry generates significant amounts of semi-solid waste, namely olive pomace. Olive pomace is a by-product rich in high-value compounds (e.g. dietary fibre, unsaturated fatty acids, polyphenols) widely explored to obtain new food ingredients. However, conventional extraction methods frequently use organic solvents, while novel eco-friendly techniques have high operational costs. The development of powdered products without any extraction step has been proposed as a more feasible and sustainable approach. RESULTS: The present study fractionated and valorised the liquid and pulp fractions of olive pomace, obtaining two stable and safe powdered ingredients, namely a liquid-enriched powder (LOPP) and a pulp-enriched powder (POPP). These powders were characterized chemically, and their bioactivity was assessed. LOPP exhibited a significant amount of mannitol (141 g/kg), potassium (54 g/kg) and hydroxytyrosol/derivatives (5 mg/g). POPP exhibited a high amount of dietary fibre (620 g/kg) associated with a significant amount of bound phenolics (7.41 mg GAE/g fibre DW) with substantial antioxidant activity. POPP also contained an unsaturated fatty acid composition similar to olive oil (76% of total fatty acids) and showed potential as a reasonable source of protein (12%). Their functional properties (solubility, water-holding and oil-holding capacity), antioxidant capacity and antimicrobial activity were also assessed, and their biological safety was verified. CONCLUSION: The development of olive pomace powders for application in the food industry could be a suitable strategy to add value to olive pomace and obtain safe multifunctional ingredients with higher health-promoting effects than dietary fibre and polyphenols themselves. This article is protected by copyright. All rights reserved.
    TBR thanks the Fundação para a Ciência e Tecnologia (FCT), Portugal for PhD grant SFRH/BDE/108271/2015 and the financial support of Association BLC3 – Technology and Innovation Campus. This work was supported by National Funds from FCT – Fundação para a Ciência e a Tecnologia through the project MULTIBIOREFINERY – SAICTPAC/0040/2015 (POCI-01-0145-FEDER-016403). We are also grateful for the scientific collaboration under the FCT project UID/Multi/50016/2019.